Overview
Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 36,457 |
| Missing cells | 11,323 |
| Missing cells (%) | 1.6% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 21.3 MiB |
| Average record size in memory | 612.3 B |
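The two derived figures in this table can be checked by hand. The sketch below assumes that "Missing cells (%)" is the missing count divided by variables times observations, and that the average record size is the total size divided by the number of observations; because the 21.3 MiB total is already rounded, the second result only approximates the reported 612.3 B.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 20
n_observations = 36_457
missing_cells = 11_323
total_size_mib = 21.3  # rounded to one decimal in the report

# Share of missing cells over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 MiB = 1024 * 1024 bytes.
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.0f} B")
```

The missing share comes out at 1.6%, matching the table.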
Variable types
| Numeric | 7 |
|---|---|
| Categorical | 11 |
| Boolean | 2 |
FLAG_MOBIL has constant value "1" | Constant |
CNT_CHILDREN is highly overall correlated with CNT_FAM_MEMBERS | High correlation |
CNT_FAM_MEMBERS is highly overall correlated with CNT_CHILDREN | High correlation |
CODE_GENDER is highly overall correlated with OCCUPATION_TYPE | High correlation |
DAYS_EMPLOYED is highly overall correlated with NAME_INCOME_TYPE and 1 other fields | High correlation |
NAME_INCOME_TYPE is highly overall correlated with DAYS_EMPLOYED | High correlation |
OCCUPATION_TYPE is highly overall correlated with CODE_GENDER and 1 other fields | High correlation |
NAME_EDUCATION_TYPE is highly imbalanced (50.6%) | Imbalance |
NAME_HOUSING_TYPE is highly imbalanced (73.1%) | Imbalance |
FLAG_EMAIL is highly imbalanced (56.4%) | Imbalance |
is_high_risk is highly imbalanced (91.6%) | Imbalance |
OCCUPATION_TYPE has 11323 (31.1%) missing values | Missing |
ID has unique values | Unique |
CNT_CHILDREN has 25201 (69.1%) zeros | Zeros |
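The imbalance percentages in the alerts above are consistent with an entropy-based score: one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy for that number of classes. The report itself does not state the formula, so treat the definition below as an assumption; `imbalance_score` is a hypothetical helper name. The counts are taken from the Common Values tables for FLAG_EMAIL and NAME_HOUSING_TYPE later in this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    Returning 0.0 for a single class is a choice made for this sketch.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts)) if len(counts) > 1 else 0.0
    if max_entropy == 0.0:
        return 0.0
    return 1 - entropy / max_entropy

flag_email = [33186, 3271]
housing_type = [32548, 1776, 1128, 575, 262, 168]

print(f"FLAG_EMAIL: {100 * imbalance_score(flag_email):.1f}%")
print(f"NAME_HOUSING_TYPE: {100 * imbalance_score(housing_type):.1f}%")
```

Under this definition the two scores come out at 56.4% and 73.1%, which agrees with the alert values.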
Reproduction
| Analysis started | 2025-05-10 10:34:41.840313 |
|---|---|
| Analysis finished | 2025-05-10 10:34:50.493082 |
| Duration | 8.65 seconds |
| Software version | ydata-profiling v4.16.1 |
| Download configuration | config.json |
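The reported duration follows directly from the two timestamps. A minimal check, assuming both are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2025-05-10 10:34:41.840313")
finished = datetime.fromisoformat("2025-05-10 10:34:50.493082")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 8.65 seconds
```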
Variables
ID
Real number (ℝ)
Unique 
| Distinct | 36457 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5078227 |
| Minimum | 5008804 |
|---|---|
| Maximum | 5150487 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 284.9 KiB |
Quantile statistics
| Minimum | 5008804 |
|---|---|
| 5-th percentile | 5018456.6 |
| Q1 | 5042028 |
| median | 5074614 |
| Q3 | 5115396 |
| 95-th percentile | 5146024.2 |
| Maximum | 5150487 |
| Range | 141683 |
| Interquartile range (IQR) | 73368 |
Descriptive statistics
| Standard deviation | 41875.241 |
|---|---|
| Coefficient of variation (CV) | 0.0082460356 |
| Kurtosis | -1.2126137 |
| Mean | 5078227 |
| Median Absolute Deviation (MAD) | 38093 |
| Skewness | 0.08624229 |
| Sum | 1.8513692 × 10¹¹ |
| Variance | 1.7535358 × 10⁹ |
| Monotonicity | Not monotonic |
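Several of the ID statistics above are simple functions of the others, which makes them easy to sanity-check. The sketch below assumes the usual definitions (range = maximum minus minimum, IQR = Q3 minus Q1, CV = standard deviation over mean, variance = standard deviation squared); the inputs are the rounded values printed in the tables, so the results agree with the report only up to that rounding.

```python
# Values copied from the ID tables above.
mean = 5_078_227
std = 41_875.241
minimum, maximum = 5_008_804, 5_150_487
q1, q3 = 5_042_028, 5_115_396
n = 36_457

value_range = maximum - minimum   # reported: 141683
iqr = q3 - q1                     # reported: 73368
cv = std / mean                   # reported: 0.0082460356
variance = std ** 2               # reported: 1.7535358e9
approx_sum = mean * n             # reported: 1.8513692e11 (mean is rounded)

print(value_range, iqr)
print(f"CV = {cv:.10f}")
print(f"Variance = {variance:.7e}")
print(f"Sum = {approx_sum:.7e}")
```

A coefficient of variation this small is expected for an identifier column: the values sit in a narrow band around five million.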
| Value | Count | Frequency (%) |
| 5008804 | 1 | < 0.1% |
| 5096993 | 1 | < 0.1% |
| 5096983 | 1 | < 0.1% |
| 5096987 | 1 | < 0.1% |
| 5096988 | 1 | < 0.1% |
| 5096990 | 1 | < 0.1% |
| 5096991 | 1 | < 0.1% |
| 5096992 | 1 | < 0.1% |
| 5096994 | 1 | < 0.1% |
| 5096978 | 1 | < 0.1% |
| Other values (36447) | 36447 | > 99.9% |
| Value | Count | Frequency (%) |
| 5008804 | 1 | < 0.1% |
| 5008805 | 1 | < 0.1% |
| 5008806 | 1 | < 0.1% |
| 5008808 | 1 | < 0.1% |
| 5008809 | 1 | < 0.1% |
| 5008810 | 1 | < 0.1% |
| 5008811 | 1 | < 0.1% |
| 5008812 | 1 | < 0.1% |
| 5008813 | 1 | < 0.1% |
| 5008814 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 5150487 | 1 | < 0.1% |
| 5150485 | 1 | < 0.1% |
| 5150484 | 1 | < 0.1% |
| 5150483 | 1 | < 0.1% |
| 5150482 | 1 | < 0.1% |
| 5150481 | 1 | < 0.1% |
| 5150480 | 1 | < 0.1% |
| 5150479 | 1 | < 0.1% |
| 5150478 | 1 | < 0.1% |
| 5150477 | 1 | < 0.1% |
CODE_GENDER
Categorical
High correlation 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
| F | |
|---|---|
| M |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | M |
|---|---|
| 2nd row | M |
| 3rd row | M |
| 4th row | F |
| 5th row | F |
Common Values
| Value | Count | Frequency (%) |
| F | 24430 | 67.0% |
| M | 12027 | 33.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| f | 24430 | 67.0% |
| m | 12027 | 33.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| F | 24430 | 67.0% |
| M | 12027 | 33.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 36457 | 100.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 24430 | 67.0% |
| M | 12027 | 33.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 36457 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| F | 24430 | 67.0% |
| M | 12027 | 33.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 36457 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| F | 24430 | 67.0% |
| M | 12027 | 33.0% |
| Value | Count | Frequency (%) |
| False | 22614 | 62.0% |
| True | 13843 | 38.0% |
FLAG_OWN_REALTY
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 35.7 KiB |
| True | |
|---|---|
| False |
| Value | Count | Frequency (%) |
| True | 24506 | 67.2% |
| False | 11951 | 32.8% |
CNT_CHILDREN
Real number (ℝ)
High correlation  Zeros 
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.43031517 |
| Minimum | 0 |
|---|---|
| Maximum | 19 |
| Zeros | 25201 |
| Zeros (%) | 69.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 284.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 19 |
| Range | 19 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.7423669 |
|---|---|
| Coefficient of variation (CV) | 1.7251702 |
| Kurtosis | 22.562434 |
| Mean | 0.43031517 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.5693822 |
| Sum | 15688 |
| Variance | 0.55110862 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 25201 | 69.1% |
| 1 | 7492 | 20.6% |
| 2 | 3256 | 8.9% |
| 3 | 419 | 1.1% |
| 4 | 63 | 0.2% |
| 5 | 20 | 0.1% |
| 14 | 3 | < 0.1% |
| 7 | 2 | < 0.1% |
| 19 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 25201 | 69.1% |
| 1 | 7492 | 20.6% |
| 2 | 3256 | 8.9% |
| 3 | 419 | 1.1% |
| 4 | 63 | 0.2% |
| 5 | 20 | 0.1% |
| 7 | 2 | < 0.1% |
| 14 | 3 | < 0.1% |
| 19 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 19 | 1 | < 0.1% |
| 14 | 3 | < 0.1% |
| 7 | 2 | < 0.1% |
| 5 | 20 | 0.1% |
| 4 | 63 | 0.2% |
| 3 | 419 | 1.1% |
| 2 | 3256 | 8.9% |
| 1 | 7492 | 20.6% |
| 0 | 25201 | 69.1% |
AMT_INCOME_TOTAL
Real number (ℝ)
| Distinct | 265 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 186685.74 |
| Minimum | 27000 |
|---|---|
| Maximum | 1575000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 284.9 KiB |
Quantile statistics
| Minimum | 27000 |
|---|---|
| 5-th percentile | 76500 |
| Q1 | 121500 |
| median | 157500 |
| Q3 | 225000 |
| 95-th percentile | 360000 |
| Maximum | 1575000 |
| Range | 1548000 |
| Interquartile range (IQR) | 103500 |
Descriptive statistics
| Standard deviation | 101789.23 |
|---|---|
| Coefficient of variation (CV) | 0.54524373 |
| Kurtosis | 17.598084 |
| Mean | 186685.74 |
| Median Absolute Deviation (MAD) | 45000 |
| Skewness | 2.7390099 |
| Sum | 6.8060019 × 10⁹ |
| Variance | 1.0361047 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 135000 | 4309 | 11.8% |
| 180000 | 3097 | 8.5% |
| 157500 | 3089 | 8.5% |
| 112500 | 2956 | 8.1% |
| 225000 | 2926 | 8.0% |
| 202500 | 2192 | 6.0% |
| 90000 | 1769 | 4.9% |
| 270000 | 1675 | 4.6% |
| 315000 | 1001 | 2.7% |
| 67500 | 873 | 2.4% |
| Other values (255) | 12570 | 34.5% |
| Value | Count | Frequency (%) |
| 27000 | 3 | < 0.1% |
| 29250 | 7 | < 0.1% |
| 30150 | 3 | < 0.1% |
| 31500 | 16 | < 0.1% |
| 31531.5 | 3 | < 0.1% |
| 31950 | 1 | < 0.1% |
| 32400 | 5 | < 0.1% |
| 33300 | 10 | < 0.1% |
| 33750 | 1 | < 0.1% |
| 36000 | 5 | < 0.1% |
| Value | Count | Frequency (%) |
| 1575000 | 8 | < 0.1% |
| 1350000 | 6 | < 0.1% |
| 1125000 | 3 | < 0.1% |
| 990000 | 4 | < 0.1% |
| 945000 | 4 | < 0.1% |
| 900000 | 39 | 0.1% |
| 810000 | 15 | < 0.1% |
| 787500 | 5 | < 0.1% |
| 765000 | 9 | < 0.1% |
| 742500 | 5 | < 0.1% |
NAME_INCOME_TYPE
Categorical
High correlation 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.4 MiB |
| Working | |
|---|---|
| Commercial associate | |
| Pensioner | |
| State servant | |
| Student | 11 |
Length
| Max length | 20 |
|---|---|
| Median length | 7 |
| Mean length | 10.856159 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Working |
|---|---|
| 2nd row | Working |
| 3rd row | Working |
| 4th row | Commercial associate |
| 5th row | Commercial associate |
Common Values
| Value | Count | Frequency (%) |
| Working | 18819 | 51.6% |
| Commercial associate | 8490 | 23.3% |
| Pensioner | 6152 | 16.9% |
| State servant | 2985 | 8.2% |
| Student | 11 | < 0.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| working | 18819 | 39.3% |
| commercial | 8490 | 17.7% |
| associate | 8490 | 17.7% |
| pensioner | 6152 | 12.8% |
| state | 2985 | 6.2% |
| servant | 2985 | 6.2% |
| student | 11 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 41951 | |
| o | 41951 | |
| r | 36446 | 9.2% |
| e | 35265 | 8.9% |
| n | 34119 | 8.6% |
| a | 31440 | 7.9% |
| s | 26117 | 6.6% |
| W | 18819 | 4.8% |
| k | 18819 | 4.8% |
| g | 18819 | 4.8% |
| Other values (11) | 92037 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 347851 | |
| Uppercase Letter | 36457 | 9.2% |
| Space Separator | 11475 | 2.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| i | 41951 | |
| o | 41951 | |
| r | 36446 | |
| e | 35265 | |
| n | 34119 | |
| a | 31440 | |
| s | 26117 | |
| k | 18819 | |
| g | 18819 | |
| t | 17467 | 5.0% |
| Other values (6) | 45457 |
Uppercase Letter
| Value | Count | Frequency (%) |
| W | 18819 | |
| C | 8490 | |
| P | 6152 | 16.9% |
| S | 2996 | 8.2% |
Space Separator
| Value | Count | Frequency (%) |
| 11475 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 384308 | |
| Common | 11475 | 2.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| i | 41951 | |
| o | 41951 | |
| r | 36446 | |
| e | 35265 | |
| n | 34119 | |
| a | 31440 | 8.2% |
| s | 26117 | 6.8% |
| W | 18819 | 4.9% |
| k | 18819 | 4.9% |
| g | 18819 | 4.9% |
| Other values (10) | 80562 |
Common
| Value | Count | Frequency (%) |
| 11475 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 395783 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| i | 41951 | |
| o | 41951 | |
| r | 36446 | 9.2% |
| e | 35265 | 8.9% |
| n | 34119 | 8.6% |
| a | 31440 | 7.9% |
| s | 26117 | 6.6% |
| W | 18819 | 4.8% |
| k | 18819 | 4.8% |
| g | 18819 | 4.8% |
| Other values (11) | 92037 |
NAME_EDUCATION_TYPE
Categorical
Imbalance 
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.8 MiB |
| Secondary / secondary special | |
|---|---|
| Higher education | |
| Incomplete higher | 1410 |
| Lower secondary | 374 |
| Academic degree | 32 |
Length
| Max length | 29 |
|---|---|
| Median length | 29 |
| Mean length | 24.862633 |
| Min length | 15 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Higher education |
|---|---|
| 2nd row | Higher education |
| 3rd row | Secondary / secondary special |
| 4th row | Secondary / secondary special |
| 5th row | Secondary / secondary special |
Common Values
| Value | Count | Frequency (%) |
| Secondary / secondary special | 24777 | 68.0% |
| Higher education | 9864 | 27.1% |
| Incomplete higher | 1410 | 3.9% |
| Lower secondary | 374 | 1.0% |
| Academic degree | 32 | 0.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| secondary | 49928 | |
| 24777 | ||
| special | 24777 | |
| higher | 11274 | 9.2% |
| education | 9864 | 8.1% |
| incomplete | 1410 | 1.2% |
| lower | 374 | 0.3% |
| academic | 32 | < 0.1% |
| degree | 32 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 99165 | |
| c | 86043 | |
| 86011 | ||
| a | 84601 | |
| r | 61608 | 6.8% |
| o | 61576 | 6.8% |
| n | 61202 | 6.8% |
| d | 59856 | 6.6% |
| y | 49928 | 5.5% |
| s | 49928 | 5.5% |
| Other values (15) | 206499 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 759172 | |
| Space Separator | 86011 | 9.5% |
| Uppercase Letter | 36457 | 4.0% |
| Other Punctuation | 24777 | 2.7% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 99165 | |
| c | 86043 | |
| a | 84601 | |
| r | 61608 | |
| o | 61576 | |
| n | 61202 | |
| d | 59856 | |
| y | 49928 | |
| s | 49928 | |
| i | 45947 | |
| Other values (8) | 99318 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 24777 | |
| H | 9864 | 27.1% |
| I | 1410 | 3.9% |
| L | 374 | 1.0% |
| A | 32 | 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| 86011 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 24777 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 795629 | |
| Common | 110788 | 12.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 99165 | |
| c | 86043 | |
| a | 84601 | |
| r | 61608 | |
| o | 61576 | |
| n | 61202 | |
| d | 59856 | |
| y | 49928 | 6.3% |
| s | 49928 | 6.3% |
| i | 45947 | 5.8% |
| Other values (13) | 135775 |
Common
| Value | Count | Frequency (%) |
| 86011 | ||
| / | 24777 | 22.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 906417 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 99165 | |
| c | 86043 | |
| 86011 | ||
| a | 84601 | |
| r | 61608 | 6.8% |
| o | 61576 | 6.8% |
| n | 61202 | 6.8% |
| d | 59856 | 6.6% |
| y | 49928 | 5.5% |
| s | 49928 | 5.5% |
| Other values (15) | 206499 |
NAME_FAMILY_STATUS
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.3 MiB |
| Married | |
|---|---|
| Single / not married | |
| Civil marriage | |
| Separated | 2103 |
| Widow | 1532 |
Length
| Max length | 20 |
|---|---|
| Median length | 7 |
| Mean length | 9.3187317 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Civil marriage |
|---|---|
| 2nd row | Civil marriage |
| 3rd row | Married |
| 4th row | Single / not married |
| 5th row | Single / not married |
Common Values
| Value | Count | Frequency (%) |
| Married | 25048 | 68.7% |
| Single / not married | 4829 | 13.2% |
| Civil marriage | 2945 | 8.1% |
| Separated | 2103 | 5.8% |
| Widow | 1532 | 4.2% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| married | 29877 | |
| single | 4829 | 9.0% |
| 4829 | 9.0% | |
| not | 4829 | 9.0% |
| civil | 2945 | 5.5% |
| marriage | 2945 | 5.5% |
| separated | 2103 | 3.9% |
| widow | 1532 | 2.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| r | 67747 | |
| i | 45073 | |
| e | 41857 | |
| a | 39973 | |
| d | 33512 | |
| M | 25048 | 7.4% |
| 17432 | 5.1% | |
| n | 9658 | 2.8% |
| g | 7774 | 2.3% |
| l | 7774 | 2.3% |
| Other values (10) | 43885 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 281015 | |
| Uppercase Letter | 36457 | 10.7% |
| Space Separator | 17432 | 5.1% |
| Other Punctuation | 4829 | 1.4% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 67747 | |
| i | 45073 | |
| e | 41857 | |
| a | 39973 | |
| d | 33512 | |
| n | 9658 | 3.4% |
| g | 7774 | 2.8% |
| l | 7774 | 2.8% |
| m | 7774 | 2.8% |
| t | 6932 | 2.5% |
| Other values (4) | 12941 | 4.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 25048 | |
| S | 6932 | 19.0% |
| C | 2945 | 8.1% |
| W | 1532 | 4.2% |
Space Separator
| Value | Count | Frequency (%) |
| 17432 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 4829 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 317472 | |
| Common | 22261 | 6.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 67747 | |
| i | 45073 | |
| e | 41857 | |
| a | 39973 | |
| d | 33512 | |
| M | 25048 | 7.9% |
| n | 9658 | 3.0% |
| g | 7774 | 2.4% |
| l | 7774 | 2.4% |
| m | 7774 | 2.4% |
| Other values (8) | 31282 |
Common
| Value | Count | Frequency (%) |
| 17432 | ||
| / | 4829 | 21.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 339733 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| r | 67747 | |
| i | 45073 | |
| e | 41857 | |
| a | 39973 | |
| d | 33512 | |
| M | 25048 | 7.4% |
| 17432 | 5.1% | |
| n | 9658 | 2.8% |
| g | 7774 | 2.3% |
| l | 7774 | 2.3% |
| Other values (10) | 43885 |
NAME_HOUSING_TYPE
Categorical
Imbalance 
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.6 MiB |
| House / apartment | |
|---|---|
| With parents | 1776 |
| Municipal apartment | 1128 |
| Rented apartment | 575 |
| Office apartment | 262 |
Length
| Max length | 19 |
|---|---|
| Median length | 17 |
| Mean length | 16.786132 |
| Min length | 12 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Rented apartment |
|---|---|
| 2nd row | Rented apartment |
| 3rd row | House / apartment |
| 4th row | House / apartment |
| 5th row | House / apartment |
Common Values
| Value | Count | Frequency (%) |
| House / apartment | 32548 | 89.3% |
| With parents | 1776 | 4.9% |
| Municipal apartment | 1128 | 3.1% |
| Rented apartment | 575 | 1.6% |
| Office apartment | 262 | 0.7% |
| Co-op apartment | 168 | 0.5% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| apartment | 34681 | |
| house | 32548 | |
| 32548 | ||
| with | 1776 | 1.7% |
| parents | 1776 | 1.7% |
| municipal | 1128 | 1.1% |
| rented | 575 | 0.5% |
| office | 262 | 0.2% |
| co-op | 168 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 73489 | |
| a | 72266 | |
| e | 70417 | |
| 69005 | ||
| n | 38160 | 6.2% |
| p | 37753 | 6.2% |
| r | 36457 | 6.0% |
| m | 34681 | 5.7% |
| s | 34324 | 5.6% |
| u | 33676 | 5.5% |
| Other values (15) | 111744 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 473794 | |
| Space Separator | 69005 | 11.3% |
| Uppercase Letter | 36457 | 6.0% |
| Other Punctuation | 32548 | 5.3% |
| Dash Punctuation | 168 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 73489 | |
| a | 72266 | |
| e | 70417 | |
| n | 38160 | |
| p | 37753 | |
| r | 36457 | |
| m | 34681 | |
| s | 34324 | |
| u | 33676 | |
| o | 32884 | |
| Other values (6) | 9687 | 2.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| H | 32548 | |
| W | 1776 | 4.9% |
| M | 1128 | 3.1% |
| R | 575 | 1.6% |
| O | 262 | 0.7% |
| C | 168 | 0.5% |
Space Separator
| Value | Count | Frequency (%) |
| 69005 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 32548 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 168 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 510251 | |
| Common | 101721 | 16.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 73489 | |
| a | 72266 | |
| e | 70417 | |
| n | 38160 | |
| p | 37753 | |
| r | 36457 | |
| m | 34681 | |
| s | 34324 | |
| u | 33676 | |
| o | 32884 | |
| Other values (12) | 46144 |
Common
| Value | Count | Frequency (%) |
| 69005 | ||
| / | 32548 | |
| - | 168 | 0.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 611972 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 73489 | |
| a | 72266 | |
| e | 70417 | |
| 69005 | ||
| n | 38160 | 6.2% |
| p | 37753 | 6.2% |
| r | 36457 | 6.0% |
| m | 34681 | 5.7% |
| s | 34324 | 5.6% |
| u | 33676 | 5.5% |
| Other values (15) | 111744 |
DAYS_BIRTH
Real number (ℝ)
| Distinct | 7183 |
|---|---|
| Distinct (%) | 19.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -15975.173 |
| Minimum | -25152 |
|---|---|
| Maximum | -7489 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 36457 |
| Negative (%) | 100.0% |
| Memory size | 284.9 KiB |
Quantile statistics
| Minimum | -25152 |
|---|---|
| 5-th percentile | -23019 |
| Q1 | -19438 |
| median | -15563 |
| Q3 | -12462 |
| 95-th percentile | -9874 |
| Maximum | -7489 |
| Range | 17663 |
| Interquartile range (IQR) | 6976 |
Descriptive statistics
| Standard deviation | 4200.5499 |
|---|---|
| Coefficient of variation (CV) | -0.26294237 |
| Kurtosis | -1.0456436 |
| Mean | -15975.173 |
| Median Absolute Deviation (MAD) | 3425 |
| Skewness | -0.18422965 |
| Sum | -5.824069 × 10⁸ |
| Variance | 17644620 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -12676 | 54 | 0.1% |
| -15519 | 54 | 0.1% |
| -16896 | 38 | 0.1% |
| -14667 | 37 | 0.1% |
| -15140 | 32 | 0.1% |
| -16768 | 32 | 0.1% |
| -15675 | 32 | 0.1% |
| -14136 | 30 | 0.1% |
| -13788 | 30 | 0.1% |
| -10182 | 29 | 0.1% |
| Other values (7173) | 36089 | 99.0% |
| Value | Count | Frequency (%) |
| -25152 | 2 | < 0.1% |
| -25140 | 3 | < 0.1% |
| -25099 | 1 | < 0.1% |
| -25088 | 1 | < 0.1% |
| -25010 | 2 | < 0.1% |
| -24970 | 2 | < 0.1% |
| -24963 | 1 | < 0.1% |
| -24946 | 3 | < 0.1% |
| -24932 | 4 | < 0.1% |
| -24914 | 3 | < 0.1% |
| Value | Count | Frequency (%) |
| -7489 | 1 | < 0.1% |
| -7705 | 1 | < 0.1% |
| -7723 | 2 | < 0.1% |
| -7757 | 4 | < 0.1% |
| -7959 | 2 | < 0.1% |
| -7980 | 1 | < 0.1% |
| -8041 | 4 | < 0.1% |
| -8054 | 1 | < 0.1% |
| -8056 | 2 | < 0.1% |
| -8067 | 1 | < 0.1% |
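Every DAYS_BIRTH value is negative, which suggests that the column counts days backward from some reference date; the report does not say so, so this reading is an assumption. Under that assumption, and using an average year of 365.25 days, the summary statistics can be converted to approximate ages. `days_to_years` is a hypothetical helper introduced only for this illustration.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years

def days_to_years(days: float) -> float:
    """Convert a negative day offset into a positive age in years."""
    return -days / DAYS_PER_YEAR

# Values copied from the DAYS_BIRTH tables above.
stats = {"Minimum": -25152, "Mean": -15975.173, "Maximum": -7489}

for label, days in stats.items():
    print(f"{label}: {days} days is about {days_to_years(days):.1f} years")
```

Read this way, the column spans ages of roughly 20.5 to 68.9 years with a mean near 43.7. Note that the minimum day count corresponds to the oldest age.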
DAYS_EMPLOYED
Real number (ℝ)
High correlation 
| Distinct | 3640 |
|---|---|
| Distinct (%) | 10.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 59262.936 |
| Minimum | -15713 |
|---|---|
| Maximum | 365243 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 30322 |
| Negative (%) | 83.2% |
| Memory size | 284.9 KiB |
Quantile statistics
| Minimum | -15713 |
|---|---|
| 5-th percentile | -7205 |
| Q1 | -3153 |
| median | -1552 |
| Q3 | -408 |
| 95-th percentile | 365243 |
| Maximum | 365243 |
| Range | 380956 |
| Interquartile range (IQR) | 2745 |
Descriptive statistics
| Standard deviation | 137651.33 |
|---|---|
| Coefficient of variation (CV) | 2.3227222 |
| Kurtosis | 1.1433987 |
| Mean | 59262.936 |
| Median Absolute Deviation (MAD) | 1309 |
| Skewness | 1.7724432 |
| Sum | 2.1605488 × 10⁹ |
| Variance | 1.894789 × 10¹⁰ |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 365243 | 6135 | 16.8% |
| -401 | 78 | 0.2% |
| -1539 | 64 | 0.2% |
| -200 | 63 | 0.2% |
| -1678 | 61 | 0.2% |
| -2087 | 61 | 0.2% |
| -2531 | 56 | 0.2% |
| -460 | 54 | 0.1% |
| -1160 | 53 | 0.1% |
| -2057 | 52 | 0.1% |
| Other values (3630) | 29780 | 81.7% |
| Value | Count | Frequency (%) |
| -15713 | 1 | < 0.1% |
| -15661 | 4 | < 0.1% |
| -15227 | 1 | < 0.1% |
| -15072 | 3 | < 0.1% |
| -15038 | 16 | < 0.1% |
| -14887 | 6 | < 0.1% |
| -14810 | 8 | < 0.1% |
| -14775 | 2 | < 0.1% |
| -14536 | 4 | < 0.1% |
| -14473 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 365243 | 6135 | 16.8% |
| -17 | 3 | < 0.1% |
| -43 | 1 | < 0.1% |
| -65 | 2 | < 0.1% |
| -66 | 1 | < 0.1% |
| -70 | 4 | < 0.1% |
| -71 | 1 | < 0.1% |
| -73 | 17 | < 0.1% |
| -78 | 1 | < 0.1% |
| -79 | 1 | < 0.1% |
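The value 365243 stands out in the tables above: it is the only positive value (30,322 negative rows plus 6,135 rows at 365243 make up all 36,457 observations), and it corresponds to roughly 1,000 years in days. That pattern suggests a placeholder rather than a real employment length, although the report does not confirm this. The sketch below shows one way to neutralise it before analysis; the placeholder interpretation, the `SENTINEL` constant, and the `clean_days_employed` helper are all assumptions made for illustration.

```python
from typing import Optional

# Assumption: 365243 is a placeholder, not a real day count.
SENTINEL = 365_243

def clean_days_employed(value: int) -> Optional[int]:
    """Return None for the suspected placeholder, otherwise the value unchanged."""
    return None if value == SENTINEL else value

# A few values taken from the Common Values table above.
sample = [365243, -401, -1539, -200]
cleaned = [clean_days_employed(v) for v in sample]
print(cleaned)  # [None, -401, -1539, -200]

# The counts in the report are consistent with a single positive value.
assert 30_322 + 6_135 == 36_457
```

If the placeholder reading is right, the reported mean (59262.936) and standard deviation (137651.33) are dominated by this one value and say little about actual employment durations.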
FLAG_MOBIL
Categorical
Constant 
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
| 1 |
|---|
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 36457 | 100.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1 | 36457 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 36457 | 100.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 36457 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 36457 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 36457 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 36457 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 36457 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 36457 | 100.0% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 28235 | 77.4% |
| 1 | 8222 | 22.6% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 28235 | 77.4% |
| 1 | 8222 | 22.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 28235 | 77.4% |
| 1 | 8222 | 22.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 36457 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 28235 | 77.4% |
| 1 | 8222 | 22.6% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 36457 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 28235 | 77.4% |
| 1 | 8222 | 22.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 36457 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 28235 | 77.4% |
| 1 | 8222 | 22.6% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 25709 | 70.5% |
| 1 | 10748 | 29.5% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 25709 | 70.5% |
| 1 | 10748 | 29.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 25709 | 70.5% |
| 1 | 10748 | 29.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 36457 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 25709 | 70.5% |
| 1 | 10748 | 29.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 36457 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 25709 | 70.5% |
| 1 | 10748 | 29.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 36457 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 25709 | 70.5% |
| 1 | 10748 | 29.5% |
FLAG_EMAIL
Categorical
Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
| 0 | |
|---|---|
| 1 | 3271 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 33186 | 91.0% |
| 1 | 3271 | 9.0% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 33186 | 91.0% |
| 1 | 3271 | 9.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 33186 | 91.0% |
| 1 | 3271 | 9.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 36457 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 33186 | 91.0% |
| 1 | 3271 | 9.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 36457 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 33186 | 91.0% |
| 1 | 3271 | 9.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 36457 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 33186 | 91.0% |
| 1 | 3271 | 9.0% |
OCCUPATION_TYPE
Categorical
High correlation  Missing 
| Distinct | 18 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 11323 |
| Missing (%) | 31.1% |
| Memory size | 2.3 MiB |
| Laborers | |
|---|---|
| Core staff | |
| Sales staff | |
| Managers | |
| Drivers | |
| Other values (13) |
Length
| Max length | 21 |
|---|---|
| Median length | 20 |
| Mean length | 10.535768 |
| Min length | 7 |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Security staff |
|---|---|
| 2nd row | Sales staff |
| 3rd row | Sales staff |
| 4th row | Sales staff |
| 5th row | Sales staff |
Common Values
| Value | Count | Frequency (%) |
| Laborers | 6211 | 17.0% |
| Core staff | 3591 | 9.8% |
| Sales staff | 3485 | 9.6% |
| Managers | 3012 | 8.3% |
| Drivers | 2138 | 5.9% |
| High skill tech staff | 1383 | 3.8% |
| Accountants | 1241 | 3.4% |
| Medicine staff | 1207 | 3.3% |
| Cooking staff | 655 | 1.8% |
| Security staff | 592 | 1.6% |
| Other values (8) | 1619 | 4.4% |
| (Missing) | 11323 | 31.1% |
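The frequencies in this table are shares of all 36,457 rows, missing values included. Because nearly a third of OCCUPATION_TYPE is missing, the share of each occupation among rows that actually have a value is noticeably higher. The sketch below illustrates the difference for the largest category; it is plain arithmetic on the counts shown above, not something the report computes.

```python
# Counts copied from the OCCUPATION_TYPE tables above.
n_rows = 36_457
n_missing = 11_323
laborers = 6_211

n_present = n_rows - n_missing

missing_pct = 100 * n_missing / n_rows
share_of_all = 100 * laborers / n_rows
share_of_present = 100 * laborers / n_present

print(f"Rows with a value: {n_present}")
print(f"Missing: {missing_pct:.1f}%")
print(f"Laborers, share of all rows: {share_of_all:.1f}%")
print(f"Laborers, share of non-missing rows: {share_of_present:.1f}%")
```

Laborers make up 17.0% of all rows but about 24.7% of the rows with a recorded occupation. The 11,323 missing values here also account for every missing cell counted in the dataset overview.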
Length
| Value | Count | Frequency (%) |
| staff | 12127 | |
| laborers | 6386 | |
| core | 3591 | 8.8% |
| sales | 3485 | 8.6% |
| managers | 3012 | 7.4% |
| drivers | 2138 | 5.3% |
| high | 1383 | 3.4% |
| skill | 1383 | 3.4% |
| tech | 1383 | 3.4% |
| accountants | 1241 | 3.1% |
| Other values (13) | 4496 | 11.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 30815 | |
| s | 30695 | |
| r | 25581 | |
| e | 25543 | |
| f | 24254 | 9.2% |
| t | 17411 | 6.6% |
| 15491 | 5.8% | |
| o | 12703 | 4.8% |
| i | 10304 | 3.9% |
| n | 8711 | 3.3% |
| Other values (26) | 63298 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 223512 | |
| Uppercase Letter | 25454 | 9.6% |
| Space Separator | 15491 | 5.8% |
| Dash Punctuation | 175 | 0.1% |
| Other Punctuation | 174 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 30815 | |
| s | 30695 | |
| r | 25581 | |
| e | 25543 | |
| f | 24254 | |
| t | 17411 | |
| o | 12703 | |
| i | 10304 | 4.6% |
| n | 8711 | 3.9% |
| l | 7231 | 3.2% |
| Other values (11) | 30264 |
Uppercase Letter
| Value | Count | Frequency (%) |
| L | 6561 | |
| C | 4797 | |
| S | 4228 | |
| M | 4219 | |
| D | 2138 | 8.4% |
| H | 1468 | 5.8% |
| A | 1241 | 4.9% |
| P | 344 | 1.4% |
| W | 174 | 0.7% |
| R | 164 | 0.6% |
| Other values (2) | 120 | 0.5% |
Space Separator
| Value | Count | Frequency (%) |
| 15491 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 175 |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 174 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 248966 | |
| Common | 15840 | 6.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 30815 | |
| s | 30695 | |
| r | 25581 | |
| e | 25543 | |
| f | 24254 | |
| t | 17411 | 7.0% |
| o | 12703 | 5.1% |
| i | 10304 | 4.1% |
| n | 8711 | 3.5% |
| l | 7231 | 2.9% |
| Other values (23) | 55718 |
Common
| Value | Count | Frequency (%) |
| 15491 | ||
| - | 175 | 1.1% |
| / | 174 | 1.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 264806 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 30815 | |
| s | 30695 | |
| r | 25581 | |
| e | 25543 | |
| f | 24254 | 9.2% |
| t | 17411 | 6.6% |
| 15491 | 5.8% | |
| o | 12703 | 4.8% |
| i | 10304 | 3.9% |
| n | 8711 | 3.3% |
| Other values (26) | 63298 |
CNT_FAM_MEMBERS
Real number (ℝ)
High correlation 
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.198453 |
| Minimum | 1 |
|---|---|
| Maximum | 20 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 284.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 4 |
| Maximum | 20 |
| Range | 19 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.91168614 |
|---|---|
| Coefficient of variation (CV) | 0.4146944 |
| Kurtosis | 8.1886954 |
| Mean | 2.198453 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.2985959 |
| Sum | 80149 |
| Variance | 0.83117162 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 19463 | 53.4% |
| 1 | 6987 | 19.2% |
| 3 | 6421 | 17.6% |
| 4 | 3106 | 8.5% |
| 5 | 397 | 1.1% |
| 6 | 58 | 0.2% |
| 7 | 19 | 0.1% |
| 15 | 3 | < 0.1% |
| 9 | 2 | < 0.1% |
| 20 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 6987 | 19.2% |
| 2 | 19463 | 53.4% |
| 3 | 6421 | 17.6% |
| 4 | 3106 | 8.5% |
| 5 | 397 | 1.1% |
| 6 | 58 | 0.2% |
| 7 | 19 | 0.1% |
| 9 | 2 | < 0.1% |
| 15 | 3 | < 0.1% |
| 20 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 20 | 1 | < 0.1% |
| 15 | 3 | < 0.1% |
| 9 | 2 | < 0.1% |
| 7 | 19 | 0.1% |
| 6 | 58 | 0.2% |
| 5 | 397 | 1.1% |
| 4 | 3106 | 8.5% |
| 3 | 6421 | 17.6% |
| 2 | 19463 | 53.4% |
| 1 | 6987 | 19.2% |
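The descriptive statistics for CNT_FAM_MEMBERS can be cross-checked against its value counts. The sketch below uses only the standard library. It assumes the ten rows of the frequency table cover all 36,457 observations and that the reported variance is the sample variance (divisor n - 1). That divisor is an assumption about the tool, not something the report states. The names `value_counts` and `weighted_moments` are hypothetical.

```python
import math

# Value counts for CNT_FAM_MEMBERS, copied from the frequency table above.
value_counts = {2: 19463, 1: 6987, 3: 6421, 4: 3106, 5: 397,
                6: 58, 7: 19, 15: 3, 9: 2, 20: 1}

def weighted_moments(counts):
    """Return (n, total, mean, sample variance) from a {value: count} table."""
    n = sum(counts.values())
    total = sum(v * c for v, c in counts.items())
    mean = total / n
    # Sample variance (divisor n - 1); assumed to match the report's convention.
    ss = sum(c * (v - mean) ** 2 for v, c in counts.items())
    return n, total, mean, ss / (n - 1)

n, total, mean, variance = weighted_moments(value_counts)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation = standard deviation / mean

print(n, total)
print(round(mean, 6), round(variance, 6), round(std, 6), round(cv, 6))
```

Under these assumptions the recomputed count, sum, mean, variance and coefficient of variation agree closely with the values listed in the tables for this variable.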
account_age_months
Real number (ℝ)
| Distinct | 61 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.164193 |
| Minimum | 0 |
|---|---|
| Maximum | 60 |
| Zeros | 315 |
| Zeros (%) | 0.9% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 284.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 12 |
| median | 24 |
| Q3 | 39 |
| 95-th percentile | 55 |
| Maximum | 60 |
| Range | 60 |
| Interquartile range (IQR) | 27 |
Descriptive statistics
| Standard deviation | 16.501854 |
|---|---|
| Coefficient of variation (CV) | 0.63070373 |
| Kurtosis | -1.0377619 |
| Mean | 26.164193 |
| Median Absolute Deviation (MAD) | 14 |
| Skewness | 0.28639457 |
| Sum | 953868 |
| Variance | 272.3112 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 7 | 889 | 2.4% |
| 11 | 828 | 2.3% |
| 6 | 824 | 2.3% |
| 8 | 820 | 2.2% |
| 5 | 816 | 2.2% |
| 17 | 807 | 2.2% |
| 3 | 800 | 2.2% |
| 10 | 798 | 2.2% |
| 16 | 785 | 2.2% |
| 15 | 774 | 2.1% |
| Other values (51) | 28316 | 77.7% |
| Value | Count | Frequency (%) |
| 0 | 315 | 0.9% |
| 1 | 551 | 1.5% |
| 2 | 643 | 1.8% |
| 3 | 800 | 2.2% |
| 4 | 765 | 2.1% |
| 5 | 816 | 2.2% |
| 6 | 824 | 2.3% |
| 7 | 889 | 2.4% |
| 8 | 820 | 2.2% |
| 9 | 770 | 2.1% |
| Value | Count | Frequency (%) |
| 60 | 321 | 0.9% |
| 59 | 307 | 0.8% |
| 58 | 333 | 0.9% |
| 57 | 304 | 0.8% |
| 56 | 345 | 0.9% |
| 55 | 368 | 1.0% |
| 54 | 358 | 1.0% |
| 53 | 377 | 1.0% |
| 52 | 463 | 1.3% |
| 51 | 476 | 1.3% |
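The Frequency (%) column in these tables is the count divided by the number of observations (36,457). The sketch below shows one formatting rule that is consistent with the entries in this report, including the "< 0.1%" label for very small shares. The helper name `format_frequency` and the exact rounding rule are assumptions for illustration, not the report generator's documented behaviour.

```python
def format_frequency(count, total):
    """Format count/total like the report's 'Frequency (%)' column (assumed rule)."""
    share = count / total
    # Non-zero shares that round to zero at three decimal places get a "< 0.1%" label.
    if share > 0 and round(share, 3) == 0:
        return "< 0.1%"
    return f"{share * 100:.1f}%"

total_rows = 36457  # number of observations in the dataset
for value, count in [(0, 315), (7, 889), (60, 321)]:
    print(value, count, format_frequency(count, total_rows))
```

The same helper can be applied to any row whose percentage is not shown, since each count and the dataset size are both given.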
is_high_risk
Categorical
Imbalance 
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
| 0 | 36075 |
|---|---|
| 1 | 382 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 36075 | 99.0% |
| 1 | 382 | 1.0% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 36075 | 99.0% |
| 1 | 382 | 1.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 36075 | 99.0% |
| 1 | 382 | 1.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 36457 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 36075 | 99.0% |
| 1 | 382 | 1.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 36457 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 36075 | 99.0% |
| 1 | 382 | 1.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 36457 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 36075 | 99.0% |
| 1 | 382 | 1.0% |
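The Imbalance flag on is_high_risk can be related to the class counts above. One common definition is one minus the Shannon entropy of the class distribution divided by the maximum possible entropy. The sketch below assumes that definition, and the function name `imbalance_score` is hypothetical. With the counts 36,075 and 382 it gives about 91.6%, which matches the figure quoted in the report overview.

```python
import math

def imbalance_score(counts):
    """1 - (Shannon entropy / maximum entropy); 0 = balanced, 1 = single class."""
    n = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention for this sketch: not meaningful with fewer than two classes
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

score = imbalance_score([36075, 382])  # class counts for is_high_risk
print(f"{score:.1%}")
```

A perfectly balanced column scores 0 under this definition, so a value above 90% means almost all of the information in the column sits in one class.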
Interactions
Correlations
| | AMT_INCOME_TOTAL | CNT_CHILDREN | CNT_FAM_MEMBERS | CODE_GENDER | DAYS_BIRTH | DAYS_EMPLOYED | FLAG_EMAIL | FLAG_OWN_CAR | FLAG_OWN_REALTY | FLAG_PHONE | FLAG_WORK_PHONE | ID | NAME_EDUCATION_TYPE | NAME_FAMILY_STATUS | NAME_HOUSING_TYPE | NAME_INCOME_TYPE | OCCUPATION_TYPE | account_age_months | is_high_risk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AMT_INCOME_TOTAL | 1.000 | 0.044 | 0.022 | 0.201 | 0.095 | -0.163 | 0.086 | 0.205 | 0.041 | 0.047 | 0.040 | -0.021 | 0.109 | 0.033 | 0.054 | 0.098 | 0.112 | 0.028 | 0.000 |
| CNT_CHILDREN | 0.044 | 1.000 | 0.826 | 0.063 | 0.380 | -0.143 | 0.007 | 0.086 | 0.007 | 0.019 | 0.055 | 0.029 | 0.017 | 0.077 | 0.032 | 0.071 | 0.055 | 0.005 | 0.000 |
| CNT_FAM_MEMBERS | 0.022 | 0.826 | 1.000 | 0.104 | 0.306 | -0.147 | 0.030 | 0.118 | 0.019 | 0.023 | 0.055 | 0.027 | 0.031 | 0.154 | 0.066 | 0.120 | 0.057 | 0.026 | 0.000 |
| CODE_GENDER | 0.201 | 0.063 | 0.104 | 1.000 | 0.210 | 0.175 | 0.000 | 0.361 | 0.050 | 0.026 | 0.065 | 0.050 | 0.020 | 0.161 | 0.087 | 0.191 | 0.557 | 0.018 | 0.012 |
| DAYS_BIRTH | 0.095 | 0.380 | 0.306 | 0.210 | 1.000 | -0.209 | 0.110 | 0.168 | 0.134 | 0.066 | 0.197 | 0.056 | 0.126 | 0.166 | 0.113 | 0.377 | 0.098 | -0.053 | 0.003 |
| DAYS_EMPLOYED | -0.163 | -0.143 | -0.147 | 0.175 | -0.209 | 1.000 | 0.086 | 0.157 | 0.093 | 0.004 | 0.243 | -0.008 | 0.149 | 0.210 | 0.114 | 0.998 | 1.000 | -0.080 | 0.002 |
| FLAG_EMAIL | 0.086 | 0.007 | 0.030 | 0.000 | 0.110 | 0.086 | 1.000 | 0.021 | 0.052 | 0.009 | 0.034 | 0.164 | 0.100 | 0.031 | 0.029 | 0.110 | 0.091 | 0.013 | 0.016 |
| FLAG_OWN_CAR | 0.205 | 0.086 | 0.118 | 0.361 | 0.168 | 0.157 | 0.021 | 1.000 | 0.014 | 0.013 | 0.021 | 0.061 | 0.103 | 0.155 | 0.041 | 0.162 | 0.274 | 0.041 | 0.000 |
| FLAG_OWN_REALTY | 0.041 | 0.007 | 0.019 | 0.050 | 0.134 | 0.093 | 0.052 | 0.014 | 1.000 | 0.066 | 0.208 | 0.185 | 0.041 | 0.031 | 0.207 | 0.095 | 0.052 | 0.012 | 0.001 |
| FLAG_PHONE | 0.047 | 0.019 | 0.023 | 0.026 | 0.066 | 0.004 | 0.009 | 0.013 | 0.066 | 1.000 | 0.312 | 0.063 | 0.054 | 0.044 | 0.038 | 0.010 | 0.070 | 0.016 | 0.000 |
| FLAG_WORK_PHONE | 0.040 | 0.055 | 0.055 | 0.065 | 0.197 | 0.243 | 0.034 | 0.021 | 0.208 | 0.312 | 1.000 | 0.121 | 0.048 | 0.064 | 0.036 | 0.257 | 0.062 | 0.022 | 0.004 |
| ID | -0.021 | 0.029 | 0.027 | 0.050 | 0.056 | -0.008 | 0.164 | 0.061 | 0.185 | 0.063 | 0.121 | 1.000 | 0.042 | 0.043 | 0.033 | 0.047 | 0.067 | -0.001 | 0.016 |
| NAME_EDUCATION_TYPE | 0.109 | 0.017 | 0.031 | 0.020 | 0.126 | 0.149 | 0.100 | 0.103 | 0.041 | 0.054 | 0.048 | 0.042 | 1.000 | 0.045 | 0.052 | 0.100 | 0.206 | 0.014 | 0.006 |
| NAME_FAMILY_STATUS | 0.033 | 0.077 | 0.154 | 0.161 | 0.166 | 0.210 | 0.031 | 0.155 | 0.031 | 0.044 | 0.064 | 0.043 | 0.045 | 1.000 | 0.056 | 0.108 | 0.105 | 0.030 | 0.001 |
| NAME_HOUSING_TYPE | 0.054 | 0.032 | 0.066 | 0.087 | 0.113 | 0.114 | 0.029 | 0.041 | 0.207 | 0.038 | 0.036 | 0.033 | 0.052 | 0.056 | 1.000 | 0.062 | 0.071 | 0.014 | 0.000 |
| NAME_INCOME_TYPE | 0.098 | 0.071 | 0.120 | 0.191 | 0.377 | 0.998 | 0.110 | 0.162 | 0.095 | 0.010 | 0.257 | 0.047 | 0.100 | 0.108 | 0.062 | 1.000 | 0.180 | 0.013 | 0.010 |
| OCCUPATION_TYPE | 0.112 | 0.055 | 0.057 | 0.557 | 0.098 | 1.000 | 0.091 | 0.274 | 0.052 | 0.070 | 0.062 | 0.067 | 0.206 | 0.105 | 0.071 | 0.180 | 1.000 | 0.024 | 0.042 |
| account_age_months | 0.028 | 0.005 | 0.026 | 0.018 | -0.053 | -0.080 | 0.013 | 0.041 | 0.012 | 0.016 | 0.022 | -0.001 | 0.014 | 0.030 | 0.014 | 0.013 | 0.024 | 1.000 | 0.063 |
| is_high_risk | 0.000 | 0.000 | 0.000 | 0.012 | 0.003 | 0.002 | 0.016 | 0.000 | 0.001 | 0.000 | 0.004 | 0.016 | 0.006 | 0.001 | 0.000 | 0.010 | 0.042 | 0.063 | 1.000 |
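In the matrix above, every entry for a pair that involves a categorical column (for example CODE_GENDER and OCCUPATION_TYPE at 0.557) is non-negative, which is what an association measure such as Cramér's V would produce. That the report uses Cramér's V for these pairs is an assumption here, and a profiling tool may apply a bias correction that this sketch omits. The function name `cramers_v` and the toy tables are illustrative only.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy tables (made-up counts, not taken from the dataset).
perfect = cramers_v([[10, 0], [0, 10]])   # the two variables determine each other
independent = cramers_v([[5, 5], [5, 5]]) # no association
print(perfect, independent)
```

The measure ranges from 0 (no association) to 1 (one variable determines the other), so it carries no sign, unlike a rank correlation between two numeric columns.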
Missing values
Sample
| | ID | CODE_GENDER | FLAG_OWN_CAR | FLAG_OWN_REALTY | CNT_CHILDREN | AMT_INCOME_TOTAL | NAME_INCOME_TYPE | NAME_EDUCATION_TYPE | NAME_FAMILY_STATUS | NAME_HOUSING_TYPE | DAYS_BIRTH | DAYS_EMPLOYED | FLAG_MOBIL | FLAG_WORK_PHONE | FLAG_PHONE | FLAG_EMAIL | OCCUPATION_TYPE | CNT_FAM_MEMBERS | account_age_months | is_high_risk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5008804 | M | Y | Y | 0 | 427500.0 | Working | Higher education | Civil marriage | Rented apartment | -12005 | -4542 | 1 | 1 | 0 | 0 | NaN | 2.0 | 15 | 0 |
| 1 | 5008805 | M | Y | Y | 0 | 427500.0 | Working | Higher education | Civil marriage | Rented apartment | -12005 | -4542 | 1 | 1 | 0 | 0 | NaN | 2.0 | 14 | 0 |
| 2 | 5008806 | M | Y | Y | 0 | 112500.0 | Working | Secondary / secondary special | Married | House / apartment | -21474 | -1134 | 1 | 0 | 0 | 0 | Security staff | 2.0 | 29 | 0 |
| 3 | 5008808 | F | N | Y | 0 | 270000.0 | Commercial associate | Secondary / secondary special | Single / not married | House / apartment | -19110 | -3051 | 1 | 0 | 1 | 1 | Sales staff | 1.0 | 4 | 0 |
| 4 | 5008809 | F | N | Y | 0 | 270000.0 | Commercial associate | Secondary / secondary special | Single / not married | House / apartment | -19110 | -3051 | 1 | 0 | 1 | 1 | Sales staff | 1.0 | 26 | 0 |
| 5 | 5008810 | F | N | Y | 0 | 270000.0 | Commercial associate | Secondary / secondary special | Single / not married | House / apartment | -19110 | -3051 | 1 | 0 | 1 | 1 | Sales staff | 1.0 | 26 | 0 |
| 6 | 5008811 | F | N | Y | 0 | 270000.0 | Commercial associate | Secondary / secondary special | Single / not married | House / apartment | -19110 | -3051 | 1 | 0 | 1 | 1 | Sales staff | 1.0 | 38 | 0 |
| 7 | 5008812 | F | N | Y | 0 | 283500.0 | Pensioner | Higher education | Separated | House / apartment | -22464 | 365243 | 1 | 0 | 0 | 0 | NaN | 1.0 | 20 | 0 |
| 8 | 5008813 | F | N | Y | 0 | 283500.0 | Pensioner | Higher education | Separated | House / apartment | -22464 | 365243 | 1 | 0 | 0 | 0 | NaN | 1.0 | 16 | 0 |
| 9 | 5008814 | F | N | Y | 0 | 283500.0 | Pensioner | Higher education | Separated | House / apartment | -22464 | 365243 | 1 | 0 | 0 | 0 | NaN | 1.0 | 17 | 0 |
| | ID | CODE_GENDER | FLAG_OWN_CAR | FLAG_OWN_REALTY | CNT_CHILDREN | AMT_INCOME_TOTAL | NAME_INCOME_TYPE | NAME_EDUCATION_TYPE | NAME_FAMILY_STATUS | NAME_HOUSING_TYPE | DAYS_BIRTH | DAYS_EMPLOYED | FLAG_MOBIL | FLAG_WORK_PHONE | FLAG_PHONE | FLAG_EMAIL | OCCUPATION_TYPE | CNT_FAM_MEMBERS | account_age_months | is_high_risk |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 36447 | 5149145 | M | Y | Y | 0 | 247500.0 | Working | Secondary / secondary special | Married | House / apartment | -10952 | -3577 | 1 | 1 | 0 | 0 | Laborers | 2.0 | 25 | 0 |
| 36448 | 5149158 | M | Y | Y | 0 | 247500.0 | Working | Secondary / secondary special | Married | House / apartment | -10952 | -3577 | 1 | 1 | 0 | 0 | Laborers | 2.0 | 28 | 0 |
| 36449 | 5149190 | M | Y | N | 1 | 450000.0 | Working | Higher education | Married | House / apartment | -9847 | -502 | 1 | 0 | 1 | 1 | Core staff | 3.0 | 11 | 1 |
| 36450 | 5149729 | M | Y | Y | 0 | 90000.0 | Working | Secondary / secondary special | Married | House / apartment | -19101 | -1721 | 1 | 0 | 0 | 0 | NaN | 2.0 | 21 | 0 |
| 36451 | 5149775 | F | Y | Y | 0 | 130500.0 | Working | Secondary / secondary special | Married | House / apartment | -16137 | -9391 | 1 | 0 | 1 | 0 | Laborers | 2.0 | 19 | 0 |
| 36452 | 5149828 | M | Y | Y | 0 | 315000.0 | Working | Secondary / secondary special | Married | House / apartment | -17348 | -2420 | 1 | 0 | 0 | 0 | Managers | 2.0 | 11 | 1 |
| 36453 | 5149834 | F | N | Y | 0 | 157500.0 | Commercial associate | Higher education | Married | House / apartment | -12387 | -1325 | 1 | 0 | 1 | 1 | Medicine staff | 2.0 | 23 | 0 |
| 36454 | 5149838 | F | N | Y | 0 | 157500.0 | Pensioner | Higher education | Married | House / apartment | -12387 | -1325 | 1 | 0 | 1 | 1 | Medicine staff | 2.0 | 32 | 0 |
| 36455 | 5150049 | F | N | Y | 0 | 283500.0 | Working | Secondary / secondary special | Married | House / apartment | -17958 | -655 | 1 | 0 | 0 | 0 | Sales staff | 2.0 | 9 | 1 |
| 36456 | 5150337 | M | N | Y | 0 | 112500.0 | Working | Secondary / secondary special | Single / not married | Rented apartment | -9188 | -1193 | 1 | 0 | 0 | 0 | Laborers | 1.0 | 13 | 0 |
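In the sample rows, DAYS_BIRTH and DAYS_EMPLOYED are stored as negative day offsets, and several Pensioner rows carry DAYS_EMPLOYED = 365243. The sketch below shows one way to turn these into readable values. It assumes the offsets count days before the record date and that 365243 is a placeholder for "not currently employed". Both are assumptions drawn from the sample, and the helper names are hypothetical.

```python
UNEMPLOYED_SENTINEL = 365243  # assumed placeholder, seen on Pensioner rows in the sample

def age_in_years(days_birth):
    """Convert DAYS_BIRTH (negative days relative to the record date) to years."""
    return -days_birth / 365.25

def days_in_employment(days_employed):
    """Return employment length in days, or None for the assumed sentinel value."""
    if days_employed == UNEMPLOYED_SENTINEL:
        return None
    return -days_employed

# Values taken from the first sample row and from a Pensioner row.
print(round(age_in_years(-12005), 1))
print(days_in_employment(-4542))
print(days_in_employment(365243))
```

Mapping the sentinel to a missing value before computing statistics keeps a single large positive number from dominating the mean and correlations of DAYS_EMPLOYED.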